This file describes the organisation and content of the data set:

"Road Traffic Time-Series Measurements of Flow and Occupancy from 10,150 Loop Detectors in 25 Cities"


Directory Structure And File Name Formats:
------------------------------------------

The data are organised according to the following directory structure and file name formats:

s2b.Loop.Detector.Locations/detectors.<DATA_SOURCE>.<COUNTRY>.<CITY>.fits
s2b.Loop.Detector.Locations/summary.statistics.of.detectors.per.city.txt
s2b.Loop.Detector.Locations/summary.statistics.of.measurements.per.city.detector.txt
s2b.Loop.Detector.Measurements.Raw/<DATA_SOURCE>/<COUNTRY>/<CITY>/<DETECTOR_ID>/measurements.raw.<DATA_SOURCE>.<COUNTRY>.<CITY>.<DETECTOR_ID>.fits

where <DATA_SOURCE> is one of "LD.Flow.LD.Occupancy" or "LD.Flow.LD.Occupancy.LD.Speed", <COUNTRY> is a country name, <CITY> is a city name, and <DETECTOR_ID> is a
loop detector ID.
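
As an illustration of the file name formats above, the following minimal sketch builds the relative paths from their components. The helper names "detectors_path"
and "measurements_path" are hypothetical (they are not part of the data set), and the paths are relative to wherever the data set has been unpacked:

```python
from pathlib import PurePosixPath

# Top-level directory names, copied from the directory structure above.
LOCATIONS_DIR = "s2b.Loop.Detector.Locations"
MEASUREMENTS_DIR = "s2b.Loop.Detector.Measurements.Raw"


def detectors_path(data_source, country, city):
    """Relative path of the detector locations file for one city."""
    name = f"detectors.{data_source}.{country}.{city}.fits"
    return PurePosixPath(LOCATIONS_DIR) / name


def measurements_path(data_source, country, city, detector_id):
    """Relative path of the raw measurements file for one loop detector."""
    name = f"measurements.raw.{data_source}.{country}.{city}.{detector_id}.fits"
    return PurePosixPath(MEASUREMENTS_DIR) / data_source / country / city / detector_id / name
```

Here "data_source" is expected to be one of the two <DATA_SOURCE> strings listed above; the sketch does not validate any of its arguments.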


Data Formats:
-------------

Most of the data are stored in FITS binary table files; the two summary statistics files are ASCII text files. The FITS binary table files can be read using Python. To do
this, first make sure that the Python package "astropy" is installed:

https://docs.astropy.org/en/stable/index.html
https://docs.astropy.org/en/stable/table/index.html

Then, to read in a file "FILE.fits", issue the following commands from within Python:

from astropy.table import Table
data_table = Table.read('FILE.fits', format = 'fits')


Data On The Loop Detector Locations And The Roads On Which They Are Located:
----------------------------------------------------------------------------

The data on the loop detector locations, and the corresponding (link) roads where they are located, are stored in the FITS binary table files with names of the form:

detectors.<DATA_SOURCE>.<COUNTRY>.<CITY>.fits

If <DATA_SOURCE> is the string "LD.Flow.LD.Occupancy", then flow and occupancy measurements are available for the loop detectors recorded in this file, and if
<DATA_SOURCE> is the string "LD.Flow.LD.Occupancy.LD.Speed", then flow, occupancy, and speed measurements are available for the loop detectors recorded in this file.
The strings <COUNTRY> and <CITY> indicate the country and city, respectively, in which the loop detectors recorded in this file are located.

There are 25 such files, with a total of 10,150 rows, each corresponding to a unique loop detector. In each file, the rows are sorted by "LATITUDE". The columns in
each file are as follows:

DETECTOR_ID - STRING VECTOR - Loop detector ID made up exclusively from characters in the set {'_', '-', '+', '0', ..., '9', 'a', ..., 'z', 'A', ..., 'Z'}. The values
                              in this column are all unique.
LONGITUDE - FLOAT64 VECTOR - Longitude (deg; WGS84 - World geodetic system 1984) of the loop detector. The values in this column are in the range -180.0 to 180.0
                             inclusive with no bad values.
LATITUDE - FLOAT64 VECTOR - Latitude (deg; WGS84 - World geodetic system 1984) of the loop detector. The values in this column are in the range -90.0 to 90.0 inclusive
                            with no bad values, and they are sorted into ascending order.
LENGTH - FLOAT64 VECTOR - Length (km) of the (link) road on which the loop detector is located. The values in this column are positive with no bad values.
POSITION - FLOAT64 VECTOR - Loop detector location as a ratio of the distance from the downstream intersection to the length of the (link) road on which the loop
                            detector is located. The values in this column are in the range 0.0 to 1.0 inclusive with no bad values.
ROAD_CLASS - STRING VECTOR - Classification of the (link) road on which the loop detector is located (from OpenStreetMap). The values in this column are from the
                             set {'living_street', 'motorway', 'motorway_link', 'primary', 'primary_link', 'residential', 'secondary', 'secondary_link', 'service',
                             'tertiary', 'tertiary_link', 'trunk', 'trunk_link', 'unclassified'} with no bad values.
SPEED_LIMIT - FLOAT64 VECTOR - Speed limit (km/h) of the (link) road on which the loop detector is located (from OpenStreetMap). Good values in this column are
                               positive, while bad values are -1.0.
NLANES - INT32 VECTOR - Number of lanes covered by the loop detector. The values in this column are positive with no bad values.
LINK_ID - INT32 VECTOR - An ID number for the corresponding link road. Good values in this column are non-negative, while bad values are -1.
LINK_PTS_LONGITUDE - VECTOR OF FLOAT64 100-ELEMENT VECTORS - Longitudes (deg; WGS84 - World geodetic system 1984) of the points mapping out the corresponding link
                                                             road. All values in the vectors in this column are in the range -180.0 to 180.0 inclusive with no bad
                                                             values.
LINK_PTS_LATITUDE - VECTOR OF FLOAT64 100-ELEMENT VECTORS - Latitudes (deg; WGS84 - World geodetic system 1984) of the points mapping out the corresponding link road.
                                                            All values in the vectors in this column are in the range -90.0 to 90.0 inclusive with no bad values.
LINK_PTS_FLAG - VECTOR OF INT32 100-ELEMENT VECTORS - Flags indicating which points map out the corresponding link road. All values in the vectors in this column are
                                                      0 (ignore) or 1 (point on the link road).
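
As an illustration of how the "LINK_PTS_*", "LENGTH", and "POSITION" columns fit together, the following minimal sketch extracts the points that map out a link road
and converts "POSITION" into a distance. The function names are hypothetical, and the inputs are assumed to be the values from a single row of the table:

```python
def link_polyline(longitudes, latitudes, flags):
    """Return (longitude, latitude) pairs for the points with LINK_PTS_FLAG = 1."""
    return [
        (float(lon), float(lat))
        for lon, lat, flag in zip(longitudes, latitudes, flags)
        if int(flag) == 1  # entries with flag 0 are to be ignored
    ]


def distance_from_downstream_km(length_km, position):
    """Distance (km) of the loop detector from the downstream intersection.

    POSITION is the ratio of this distance to LENGTH, so the distance is
    their product.
    """
    return float(length_km) * float(position)
```

For a row "row" of a table read as described under "Data Formats", the first function would be called as
link_polyline(row['LINK_PTS_LONGITUDE'], row['LINK_PTS_LATITUDE'], row['LINK_PTS_FLAG']).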


Loop Detector Measurements:
---------------------------

The data on the loop detector measurements are stored in the FITS binary table files with names of the form:

measurements.raw.<DATA_SOURCE>.<COUNTRY>.<CITY>.<DETECTOR_ID>.fits

If <DATA_SOURCE> is the string "LD.Flow.LD.Occupancy", then the file contains flow and occupancy measurements, and if <DATA_SOURCE> is the string
"LD.Flow.LD.Occupancy.LD.Speed", then the file contains flow, occupancy, and speed measurements. The strings <COUNTRY>, <CITY>, and <DETECTOR_ID> indicate the country,
city, and ID, respectively, of the loop detector that made the measurements recorded in the file.

There are 10,150 such files, with a total of 168,183,675 rows (all unique), of which 147,034,646 rows correspond to good measurements. For a specific combination of
<CITY> and <DETECTOR_ID> (i.e. for a specific file), any rows that share the same combination of "DATE" and "INTERVAL_START" values are flagged with "ERROR_FLAG = 1". In
each file, the rows are sorted by "DATE" and then "INTERVAL_START". The columns in each file are as follows:

DATE - STRING VECTOR - Date (YYYY-MM-DD) on which the measurement was taken (local time). The values in this column are valid dates with no bad values, and they are
                       sorted into ascending order.
INTERVAL_START - INT32 VECTOR - Time at the start of the aggregation time-interval (seconds after midnight local time). The values in this column are in the range 0
                                to 86400 inclusive with no bad values. For any specific "DATE" value, the "INTERVAL_START" values are sorted into ascending order.
FLOW - FLOAT64 VECTOR - Vehicle count in the aggregation time-interval scaled to 1 hour (veh/hour; flow). Good values in this column are non-negative, while bad
                        values are -1.0 (flagged with "ERROR_FLAG = 1").
OCCUPANCY - FLOAT64 VECTOR - Fraction of time in the aggregation time-interval that the loop detector is occupied (occupancy). Good values in this column are in the
                             range 0.0 to 1.0 inclusive, while bad values are -1.0 (flagged with "ERROR_FLAG = 1"). Note that positive flow or speed measurements at
                             zero occupancy are not flagged as errors.
ERROR_FLAG - INT32 VECTOR - Flag indicating an error. The values in this column are 0 (no error) or 1 (error).
SPEED - FLOAT64 VECTOR - Mean vehicle speed (km/h) in the aggregation time-interval. Good values in this column are non-negative, while bad values are -1.0 (flagged
                         with "ERROR_FLAG = 1"). This column is only present if <DATA_SOURCE> is the string "LD.Flow.LD.Occupancy.LD.Speed".
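
As an illustration of how the columns above might be used, the following minimal sketch combines "DATE" and "INTERVAL_START" into a single time stamp and selects the
rows that are not flagged as errors. The function names are hypothetical, and the resulting time stamps are naive local times (no time zone information is attached):

```python
from datetime import datetime, timedelta


def interval_start_datetime(date_value, interval_start):
    """Combine a DATE string and an INTERVAL_START offset into a naive local datetime."""
    # Defensive assumption: decode byte strings in case the value has not been decoded already.
    if isinstance(date_value, bytes):
        date_value = date_value.decode("ascii")
    midnight = datetime.strptime(str(date_value), "%Y-%m-%d")
    # INTERVAL_START may be as large as 86400, which rolls over to midnight of the next day.
    return midnight + timedelta(seconds=int(interval_start))


def good_row_indices(error_flags):
    """Return the indices of the rows with ERROR_FLAG = 0 (no error)."""
    return [i for i, flag in enumerate(error_flags) if int(flag) == 0]
```

For a table "data_table" read as described under "Data Formats", the same row selection can be written more compactly as a boolean mask:
data_table[data_table['ERROR_FLAG'] == 0].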

Note that the full set of loop detector measurements includes 3,196,239 non-flagged flow-occupancy measurement pairs where the occupancy is zero, and that 2,758,156
of these also have zero flow.
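
The counts quoted in this section imply two further numbers, which the following short sketch derives. The variable names are illustrative only, and the only inputs
are the figures stated above:

```python
# Figures quoted in this section.
total_rows = 168_183_675
good_rows = 147_034_646
zero_occupancy_pairs = 3_196_239
zero_occupancy_zero_flow = 2_758_156

# Rows that do not correspond to good measurements.
not_good_rows = total_rows - good_rows  # 21,149,029

# Fraction of all rows that correspond to good measurements (roughly 87 per cent).
good_fraction = good_rows / total_rows

# Non-flagged pairs with zero occupancy but positive flow. Non-flagged flow values
# are non-negative, so every pair that does not have zero flow has positive flow.
zero_occupancy_positive_flow = zero_occupancy_pairs - zero_occupancy_zero_flow  # 438,083
```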


Summary Statistics:
-------------------

The data set also includes two summary statistics files (as ASCII text files with self-explanatory column headers):

summary.statistics.of.detectors.per.city.txt - Summary statistics for the loop detectors in each city (25 data rows).
summary.statistics.of.measurements.per.city.detector.txt - Summary statistics for the measurements from each loop detector (10,150 data rows).
